Circular proteins have been identified in plants (cyclotides),
fungi (amatoxins/phallatoxins), bacteria (bacteriocins/pilins)
and animals (defensins), where they are commonly involved in
host defence against pests and pathogens. Their pesticidal
and antimicrobial activity, along with the high stability and
bioactivity provided by a cyclic backbone, makes them valuable
agricultural and pharmaceutical drug lead molecules, and
necessitates the development of efficient methods for
circular-protein production. Biosynthetic pathways usually involve
multistep post-translational modification of propeptides by
processing enzymes. Direct isolation of endogenous circular
protein is feasible in many instances but usually requires large
quantities of source biomaterial and does not lend itself to the
introduction of amino acid substitutions. At present, circular
proteins are most commonly obtained through chemical and/or
biological synthesis. Established strategies for circular-protein
synthesis include the assembly of a linear precursor peptide
through solid-phase peptide synthesis (SPPS), typically with
inclusion of an N-terminal cysteine and a C-terminal thioester,
enabling cyclization through native chemical ligation (NCL).
Alternatively, recombinant circular proteins have been obtained
through bacterial expression of intein fusion proteins, which
afford circular proteins through expressed-protein ligation or
protein trans-splicing, or through the use of sortase, a bacterial
transpeptidase.

A single C-terminal cysteine is capable of initiating an
intramolecular N→S acyl shift in native peptide sequences. This
gives rise to an S-acyl intermediate that can be intercepted by
an added thiol, causing cleavage of the protein backbone,
liberating cysteine and yielding a C-terminal thioester. This
reaction can be selective, depending on the nature of the amino
acid preceding cysteine, and occurs at a significantly slower
rate when cysteine is not positioned at the C terminus.
Furthermore, the presence of cysteine at the N terminus leads to
spontaneous backbone cyclization through intramolecular
transthioesterification, followed by an S→N rearrangement that
forms a peptide bond (Scheme 1).

The cysteine-rich nature of many circular peptides facilitates
the design of sequences containing these necessary labile
sites. This strategy lends itself to the chemical synthesis of
circular peptides; however, the genetic encodability of the
necessary labile sites should also enable application to
recombinant proteins. Bacterial peptide/protein production has
several advantages over chemical synthesis: it provides efficient
access to large polypeptides that are beyond the scope of SPPS,
at relatively low cost, and facilitates the production of
combinatorial peptide libraries through straightforward DNA
engineering.

Scheme 1. Xaa-Cys motifs undergo N→S acyl shift upon heating, and
the S-acyl intermediate is ultimately intercepted by an N-terminal
cysteine, yielding a cyclic thioester that rearranges to form a
peptide bond.

Here we demonstrate the applicability of this strategy to 
recombinant proteins using the plant cyclotide kalata B1 (KB1). 
Cyclotides are perhaps the most comprehensively studied 
family of circular proteins to date, with hundreds of members 
identified and numerous 3D structures available. They contain
28-37 amino acids, including six conserved cysteine residues
that form three disulfide bonds, giving rise to a rigid,
characteristic "cyclic cystine knot" structure, which is extremely
stable. These cysteines aside, the cyclotide family exhibits
high sequence variability, and consequently cyclotides have
emerged as valuable protein engineering scaffolds for the
incorporation and stabilisation of bioactive peptide sequences.
KB1, from the tropical herb Oldenlandia affinis, was the first
cyclotide to be identified and is considered to be the prototypic
family member.

Although synthesis of a 30-residue peptide is within the
bounds of SPPS, the often inefficient production of cysteine-rich
peptide thioesters by 9-fluorenylmethyloxycarbonyl (Fmoc) SPPS
chemistry means that the less desirable tert-butyloxycarbonyl
(Boc) SPPS chemistry (which generally requires a potentially
hazardous HF resin cleavage step) is typically employed. This
synthetic limitation increases the need for alternative
recombinant strategies for cyclotide production.

Bacterial expression of a recombinant kalata B1 linear precursor
peptide was facilitated through fusion with an N-terminal
thioredoxin (Trx) tag, yielding approximately 60 mg of purified
protein per litre of cell culture. Conveniently, one of the
six cysteines of wild-type KB1 is preceded by glycine. These
sequential glycine and cysteine residues were therefore designated
as the respective C- and N-terminal KB1 residues in the
genetically encoded sequence, and a seventh cysteine was
appended to the C terminus to provide a labile Gly-Cys site for
N→S acyl shift. The His-tagged Trx-KB1 fusion protein was
purified through immobilized Ni²⁺-affinity chromatography
(Figure 1 A and Figure S1 in the Supporting Information), and the
linear KB1 peptide was liberated initially with factor Xa
protease (Figures S2 and S3), yielding linear KB1 with an
N-terminal cysteine. Subsequently, more efficient liberation was
achieved by employing tobacco etch virus (TEV) protease
(Figure 1 B). The desired KB1 peptide was purified through
reversed-phase high-performance liquid chromatography (RP-HPLC)
in an unoptimised yield of 35%, based on the Trx-fusion
precursor (Figure 1 C), and characterized by mass spectrometry
(Figure 1 D, left).
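
The construct design described above (open the cyclic sequence at a Gly-Cys pair so that Cys becomes the first residue and Gly the last native residue, then append one extra Cys) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `linear_precursor` and the toy ring sequence are hypothetical and do not reproduce the KB1 construct used in this work.

```python
def linear_precursor(cyclic_seq: str, junction: str = "GC", extra: str = "C") -> str:
    """Open a head-to-tail cyclic sequence at a two-residue junction.

    The ring is cut between the two junction residues, so the linear chain
    starts with the second one (Cys) and ends with the first one (Gly).
    `extra` (a Cys by default) is then appended to create the C-terminal
    Gly-Cys site. All names here are illustrative assumptions.
    """
    n = len(cyclic_seq)
    doubled = cyclic_seq + cyclic_seq  # lets us find a junction spanning the string ends
    for i in range(n):
        if doubled[i:i + 2] == junction:
            start = (i + 1) % n  # index of the Cys that becomes residue 1
            return cyclic_seq[start:] + cyclic_seq[:start] + extra
    raise ValueError(f"no {junction} junction found in {cyclic_seq!r}")


toy_ring = "ALPVCGETGCTK"  # hypothetical cyclic sequence, NOT kalata B1
precursor = linear_precursor(toy_ring)
print(precursor)
```

The first Gly-Cys pair found is used; a sequence with several such pairs would need an explicit choice of junction.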

Figure 1. Production of KB1. A) SDS-PAGE analysis of Ni²⁺-affinity-purified
Trx-KB1 fusion protein. M: molecular weight markers. Lane 1: whole-cell lysate,
lane 2: soluble fraction, lane 3: insoluble fraction, lane 4: column
flow-through, lane 5: column wash (5 mM imidazole), lane 6: column wash
(20 mM imidazole), lanes 7-11: eluted fractions (40-500 mM imidazole).
B) TEV protease digestion of the fusion protein shows accumulation of Trx
(released linear KB1 is not visible on the gel). C) Preparative HPLC allows
straightforward separation of the released KB1 from Trx. D) Analytical HPLC
(lower panel) and MS (upper panels) characterization of purified linear,
cyclic (reduced) and folded KB1 samples. E) HPLC coelution experiment: KB1
(upper panel), native KB1 (isolated from O. affinis, middle panel) and a 1:1
mixture of the two peptides (lower panel).

Subsequent in vitro cyclization of the purified linear KB1
peptide was achieved through incubation at 45 °C in the
presence of 10% (w/v) sodium 2-mercaptoethanesulfonate
(MESNa). The reaction proceeded for 48 h, after which the
linear starting material was almost entirely consumed
(Figure S4). The observed drop in molecular mass of approximately
121 Da in the cyclic product was consistent with the excision
of the C-terminal cysteine residue and backbone cyclization
(Figure 1 D, middle panel). To demonstrate that we
had produced the correct circular framework, oxidative folding
was performed as previously described and confirmed by
a further loss of six mass units (Figure 1 D, right). We also
observed a characteristic increase in HPLC retention time on a
reversed-phase chromatography column following cyclization
and oxidation (Figure 1 D, lower panel), due to the exposure of
surface hydrophobic residues in the folded structure, thus
indicating the presence of natively folded KB1. Over 1 mg of
purified folded material was obtained from 2 L of cell culture.
Furthermore, we carried out analytical coelution RP-HPLC of KB1
in the presence of native KB1 (nKB1) and observed elution of
a single peak, representative of a homogeneous sample
(Figure 1 E). Nonetheless, we sought further confirmation of KB1
structural integrity through NMR spectroscopy, including 
chemical shift assignment (Figures S5 and S6) as previously
described. Our KB1 chemical shifts align extremely closely with
those of nKB1, including a diagnostic ring-current-shifted Hβ in
residue P6, at -0.17 ppm (Figure S5), and near-identical
deviations of Hα chemical shifts from random coil values,
indicative of the native structure (Figure 2). We also identified
slow-exchanging backbone amide protons in residues C5, C15, T16,
S18, V21, C22, T23, R24, L27 and V29 (Figures 2 and S7), as
previously observed for nKB1, which definitively confirmed the
native fold and characteristic hydrogen bond network.
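
The mass changes used above to follow cyclization and folding can be checked with simple arithmetic. The sketch below assumes standard average masses (these values are textbook constants, not numbers taken from this work), and the function names are hypothetical. A linear peptide has the mass of its residues plus one water; the head-to-tail ring that has lost the C-terminal cysteine has the mass of the remaining residues only, so the expected shift equals the mass of free cysteine. Each disulfide bond formed on oxidation removes two hydrogen atoms.

```python
# Average masses in Da (standard values, assumed here; not from the paper)
CYS_RESIDUE = 103.14  # cysteine residue within a peptide chain (C3H5NOS)
WATER = 18.02         # H2O
HYDROGEN = 1.008      # H atom


def cyclization_shift() -> float:
    """Mass change when the C-terminal Cys is excised and the backbone closes.

    linear = sum(residues) + H2O; ring = sum(residues) - Cys residue,
    so the change is -(Cys residue + H2O), i.e. minus one free cysteine.
    """
    return -(CYS_RESIDUE + WATER)


def oxidation_shift(n_disulfides: int) -> float:
    """Mass change on forming n disulfide bonds (two H atoms lost per bond)."""
    return -2 * HYDROGEN * n_disulfides


print(f"cyclization: {cyclization_shift():.2f} Da")
print(f"oxidation (3 disulfides): {oxidation_shift(3):.2f} Da")
```

Under these assumptions the two shifts come to about -121.2 Da and -6.0 Da, in line with the approximately 121 Da drop and the further loss of six mass units reported above.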

Figure 2. Characterisation of kalata B1 through NMR spectroscopy. A) A
ribbon representation of the native KB1 backbone structure, including
disulfide-bonded cysteines (in yellow). B) A graphical depiction of the
deviation in Hα chemical shift from random coil values for each residue in
native KB1 (blue) and our semisynthetic KB1 (orange). • denotes residues in
which slow-exchanging amide protons have been identified (Figure S7).



This report represents the first synthesis of a ribosomally
derived circular miniprotein without the use of sortase- or
intein-driven cyclization. Each of these existing protocols finds
wide use, but each has limitations. Cyclization using sortase
requires incorporation of its five-residue recognition sequence
into the circular product, which may not be desirable. Meanwhile,
inteins have a tendency to spontaneously undergo in vivo splicing
and folding, yielding the product in the bacterial cytoplasm.
This phenomenon has facilitated in vivo activity screening of
protein libraries; however, the inefficiency of the reaction
results in low yields of folded protein, and purification of
linear precursor for in vitro cyclisation and folding is required
to yield quantities of protein comparable to those from our
reported strategy. We therefore believe that, following
identification of circular protein sequence(s) with a desired
activity, a thiol-labile Xaa-Cys motif might provide a valuable and
straightforward strategy for large-scale production, through
purification of a stable bacterially derived precursor for
chemically mediated in vitro cyclization and folding. Furthermore,
this method should prove more robust in difficult cases of poor
solubility, since the required N-terminal cysteine may potentially
also be liberated chemically. Inteins and sortase, on the other
hand, require maintenance of tertiary structure for activity and
are not compatible with chaotropic agents. We repeated the
cyclization of linear KB1 in 6 M guanidinium hydrochloride, and
although the reaction proceeded with decreased efficiency, we
were pleased to observe majority conversion to the circular
product after 72 h (Figure S8).

Due to its cysteine-rich sequence, KB1 represented a
challenging molecule for selective C-terminal N→S acyl shift and
cyclization. Aside from the C-terminal Gly-Cys, KB1 contains
three Val-Cys and two Thr-Cys sequences. Previous studies
suggested that cysteine residues preceded by β-branched residues
such as valine or threonine are unreactive to intramolecular
N→S acyl shift at 45 °C. Similarly, intein-mediated synthesis of
KB1 has proved most successful when this glycine, rather than
valine or threonine, is positioned adjacent to the active
cysteine residue of the intein. Consistent with these
observations, we observed selective reactivity at the C terminus.
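
When planning a construct, it can help to list every Xaa-Cys site in a candidate sequence and flag the residue preceding each cysteine. The sketch below is a hypothetical helper, not part of the reported method: it treats Gly, His and Cys as favourable contexts and Val and Thr as disfavoured ones, following only the qualitative statements in the text, and leaves every other residue unclassified. The example sequence is invented for illustration.

```python
FAVOURED = {"G", "H", "C"}   # contexts described in the text as desirable
BETA_BRANCHED = {"V", "T"}   # described in the text as unreactive at 45 °C


def xaa_cys_sites(seq: str):
    """Return (cys_position, dipeptide, label) for every Xaa-Cys in seq.

    Positions are 1-based. A Cys at position 1 has no preceding residue
    and is therefore not reported.
    """
    sites = []
    for i in range(1, len(seq)):
        if seq[i] == "C":
            xaa = seq[i - 1]
            if xaa in FAVOURED:
                label = "favoured"
            elif xaa in BETA_BRANCHED:
                label = "disfavoured"
            else:
                label = "unclassified"
            sites.append((i + 1, xaa + "C", label))
    return sites


toy_precursor = "CTCAKVCSGC"  # hypothetical sequence, NOT kalata B1
for position, dipeptide, label in xaa_cys_sites(toy_precursor):
    print(position, dipeptide, label)
```

The labels are a qualitative reading of the text and carry no rate information; real selectivity would need to be confirmed experimentally.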

In conclusion, we report a versatile new strategy for the
semi-synthesis of circular proteins that complements the few
existing methodologies. The appendage of cysteine to the N
and C termini represents a remarkably simple route to
molecules for which both in vivo (i.e., endogenous) and established
in vitro biosynthesis is rather complex. Here the sequence
has been circularly permuted to position cysteine and glycine
at the circularization junction. In cases where cysteine is not
present, or where it is not present in the desired context (i.e.,
preceded by Cys/His/Gly), a non-native Xaa-Cys junction
may need to be introduced. However, the location of this
junction can be specifically designed to avoid active sites or
surfaces, or desulfurization of the non-native cysteine could be
performed if necessary. That the circular product appears stable
to the reverse reaction (i.e., linearization) is an intriguing
aspect of this methodology, and is currently under investigation
in our laboratory.